Regional climate contributes more than geographic distance to beta diversity of copepods (Crustacea Copepoda) between caves of Italy

Despite the study of subterranean biodiversity facing harsh sampling and mapping challenges, the huge diversity of taxa, ecological adaptations and evolutionary trajectories in subterranean environments is gaining increasing attention. Yet, the spatial and environmental factors driving the composition of groundwater communities are still poorly understood. To partially fill this knowledge gap, we collected copepod crustaceans from 12 caves along the Italian peninsula between 2019 and 2022, sampling each cave twice. The resulting presence-absence data were analysed to assess: (i) between-cave taxonomic beta diversity, also partitioning between turnover and nestedness-resultant dissimilarity; (ii) the relative weight of geographic distance and climatic differences in shaping observed beta diversity. Seventy-one species of copepods were collected overall. Pairwise beta diversity was high for most pairs of caves, with turnover being the major component. Geographic distance-decay models partially explained total beta diversity and turnover patterns. However, in Generalized Dissimilarity Models (GDM), including surface climatic conditions as predictors, the contribution of seasonal temperature averages was generally higher than that of geographic distance. Further, the explanatory and predictive performance of the GDMs notably increased, along with temperature contribution, when widening the spatial extent from which climate data were gathered. Our results confirmed a high spatial turnover in groundwater copepods’ assemblages and strengthened the link between regional climate and subterranean biodiversity.


Study sites
We selected twelve caves in diverse karst areas across the Italian peninsula, spanning multiple latitudinal bands (Fig. 1), and we sampled each cave twice between 2019 and 2022.
Abisso di Trebiciano and Grotta Antonio Federico Lindner caves are located along the underground course of the Timavo River 42 , the main contributor to the recharge of the Trieste Karst aquifer.During flood periods, the water rises several meters inside the two caves, so that their deepest portions become totally submerged.
Antro del Corchia cave is located on the southwestern slopes of the Apuan Alps and is part of one of Italy's largest karst systems 43 .Due to its huge overall length (dozens of kilometers) and depth (> 1000 m), and the consequent difficulties in displacing the sampling equipment, in this cave we focused the sampling on the habitats located close to the touristic path, which included an underground stream, some small lakes, pools, and dripping pools.On the southeastern side of the Apuan Alps are Grotta del Vento and Tana che Urla caves.Most of the former cave is characterized by unsaturated habitats, with an underground stream flowing only in its deepest part.Contrarily, Tana che Urla is crossed throughout its development by a perennial stream, without a well-developed unsaturated zone.
Grotta del Fiume has a sub-horizontal development, spanning 7 km and hosting several lakes and dripping pools, with both the saturated and unsaturated parts of the cavity being well-developed.
Grotte di Stiffe cave is traversed throughout by a perennial river, which has created meanders, waterfalls, and syphons (in the inner part of the cave).The unsaturated zone is primarily limited to one chamber, where there are some dripping pools.
Grotta Rotolo consists of wide vertical pits and sub-horizontal passages, reaching the water table at the bottom of the final drop 44 .Within the vadose zone, several pools are present at different heights.Geoportal, http:// www.pcn.minam biente.it/ mattm/ en/), with Italian administrative borders (black contour lines), the twelve sampled caves (black crosses) and their corresponding codes (see Table 1).
Table 1.For each sampled cave, the following data are reported: extended and abbreviated ("Cave code") denomination, coordinates (EPSG 4326: WGS84), elevation (at cave entrance, as derived from a 30 m-resolution DEM), lithological composition, horizontal and vertical extents.

Faunal sampling
First, we carefully identified the different habitat types within each cave, including pools, streams and lakes, gours (also termed "rimstone dams", barriers made up of calcite and other minerals being deposited around cave pools and streams).Then, we collected aquatic invertebrates using a stratified random sampling protocol, maximising sampling efforts within each habitat.For most caves, the sampling activity required advanced speleological techniques involving the use of ropes, harnesses and specific personal equipment for vertical progression in order to access the sampling sites safely.Sampling was conducted, whenever possible, in both the saturated and the unsaturated portions of the cave, and each habitat was sampled at least twice to obtain a representative sample of the corresponding invertebrate community.Faunal samples were collected and processed following a standardised protocol 47 , employing diverse sampling methods and devices based on the single habitat types.Specifically, we sampled dripping pools by aspirating water with a 50 mL syringe or manually sieving sediments using a plankton net (60 µm mesh).In areas showing intense dripping, we placed the net directly beneath the dripping area.We sampled the freeflowing water in subterranean streams and rivulets by placing the mouth of the net against the water flow and then sieving the sediment upstream of the net mouth to dislodge the benthic and interstitial species.We used a hand net (60 µm mesh) in deeper pools and where water flow was more substantial.Sampling within the phreatic zones was conducted using the same methods and equipment just described for the vadose zones, except in the phreatic zone of the Grave Rotolo cave where we employed a 60 µm-mesh Cvetkov net 48 .The Cvetkov net was also used for sampling the "Lago della Bottiglia" lake within Grotta del Fiume.
The collected specimens were preserved in 80% ethyl alcohol and stored in the Stygobiology Laboratory at the University of L' Aquila.The Crustacea Copepoda were then sorted under a Leica M205C stereomicroscope and identified to the species/subspecies level using taxonomic keys [49][50][51][52][53] and additional specialised literature.Each copepod species was further categorised as stygobite (obligate groundwater dweller) or non-stygobite (non-obligate groundwater species which are found occasionally or accidentally in groundwater).

Data analysis
All the statistical analyses were performed through the R software version 4.2.0 54 .

Estimating 'hidden' alpha diversity and between-cave beta diversity
To assess whether the set of species recorded in the sampled caves is likely to comprehensively represent the alpha diversity of Crustacea Copepoda occurring within the Italian karst groundwaters, we used three non-parametric estimators of species richness, namely 'Chao 2' 55 , 'first-order Jackknife' 56 , and 'Bootstrap' 57 , implemented within the 'specpool' function of the "vegan" package 58 version 2.6-2.Further, to visualise the relationship between the number of observed species and increasing sampling effort (here intended as the number of sampled caves), we generated a Species Accumulation Curve 59 for each of the three richness estimators, taking advantage of the 'specaccum' function from the "vegan" package.The accumulation curves were obtained through 1000 random permutations of the initial set of sites (i.e., caves) in pools of sites of increasing size (ranging from n = 1 to n = 12).
We estimated the between-cave dissimilarity of copepod assemblages by computing pairwise taxonomic beta diversity following the framework developed by Baselga 37,60 .Specifically, total pairwise beta diversity was partitioned into turnover and nestedness-resultant dissimilarity through the 'beta.pair'function implemented in the "betapart" 61 R package version 1.5.6.Turnover quantifies the extent to which the difference in species composition between two sites is due to the presence of distinct species at each site.Differently, nestednessresultant dissimilarity (hereafter, nestedness) rises when two sites share some species but one of the sites is richer in species than the other.The Sørensen dissimilarity index (β sor ) was used as a measure of total beta diversity, turnover was measured through the Simpson dissimilarity index (β sim ), while nestedness was measured through the corresponding index (β sne ) described in Baselga 37 .

Distance-decay models
To investigate the dependence of between-cave taxonomic dissimilarity upon geographic distance, we fitted distance-decay models for total beta diversity, and for its two components separately, through the 'decay.model'function of the betapart package.This function permits to contrast the observed biotic dissimilarity (response variable) between two sites to the corresponding geographic distance (predictor) via a Generalized Linear Model (GLM) with normal error distribution and logarithmic link function.) and on the p value resulting from the randomization algorithm implemented in the 'decay.model'function 39,61 , run across 1000 iterations.

Generalized dissimilarity models
Distance-decay models based on (generalized) linear regression have been also applied to assess whether the biotic dissimilarity between sites or regions is linked to the corresponding environmental dissimilarity 37,63 .In such applications, environmental distance between the contrasted sites/regions is usually measured as their distance along summary axes resulting from ordination techniques (e.g., Principal Component Analysis, Non-Metric Multidimensional Scaling) applied to environmental data.However, in recent years, a novel, more flexible, approach known as Generalized Dissimilarity Modelling 64 has gained momentum and is being more and more applied to investigate the drivers behind biotic dissimilarity at various spatial scales 65 .We took advantage of the "gdm" package 66 version 1.5.0-9.1 to fit Generalized Dissimilarity Models (GDMs) contrasting the observed between-cave beta diversity to both geographic and climatic distances.
Within GDMs, the biotic dissimilarity between sites is related to the inter-site spatial and environmental distance through a GLM with negative exponential link function (Eq.1): Where d ij is the pairwise biotic dissimilarity and η is a function of inter-site environmental distance termed "predicted ecological distance" 65 .Differently from the "traditional" distance-decay models, in GDMs the predicted ecological distance is modelled based on transformed values of the predictors, resulting from a linear combination of I-spline basis functions which allow higher flexibility in modelling non-linear relationships between biotic and spatial/environmental dissimilarity 65 .Without diving into technical details, for which we refer the reader to Mokany et al. 65 , it is worth mentioning that the degree of flexibility in the parametrization of GDMs is mainly linked to: (i) the number of I-spline functions applied for transforming each predictor; (ii) the number and location of the knots along the predictors' axes, where the knots define the portion of each predictor axis to which the single I-spline functions are applied.Here, we used the default parameters of the 'gdm' function from the homonymous package; specifically, three I-spline functions were used to transform each predictor, with the knots located at the minimum, median and maximum quantiles.
To quantify the climatic dissimilarity among the areas surrounding the 12 sampled caves, we took advantage of the Worldclim 2.1 online database 67 to download gridded climate surfaces (30 arc-seconds pixel resolution, ~ 1 km 2 ) representing the average values of the nineteen so-called "bioclimatic variables" (i.e., Bio1-Bio19) for the 1970-2000 period.These variables capture annual and seasonal trends related to temperature and precipitation.We chose as candidate predictors eight bioclimatic variables presumed to directly impact on subterranean environments 68 : Bio1 (Annual Mean Temperature), Bio3 (Isothermality), Bio10 (Mean Temperature of Warmest Quarter), Bio11 (Mean Temperature of Coldest Quarter), Bio12 (Annual Precipitation), Bio15 (Precipitation Seasonality), Bio18 (Precipitation of Warmest Quarter), and Bio19 (Precipitation of Coldest Quarter).
To assess whether climatic dissimilarity influences the between-cave taxonomic beta diversity in a scaledependent manner, we computed average values of the candidate climatic predictors within four circular buffers of increasing radius around the entrance of each cave: (i) 0.5 km; (ii) 2.5 km; (iii) 5 km; (iv) 10 km.For each buffer, we performed a Variance Inflation Factor (VIF) analysis on the corresponding climatic averages through the "usdm" R package 69 version 1.1-18, discarding the variables which attained a VIF score > 5 to avoid multicollinearity-related issues in model calibration 70 .Successively, we performed 1000 GDM fitting iterations: for each iteration, we randomly selected 85% of the 66 cave pairs (n = 56) to populate the training data table; these data were then used to calibrate a separate GDM for each combination of dissimilarity metric (i.e., β sor , β sim , β sne ) × buffer radius.Within each GDM, the dissimilarity metric represented the response variable, while between-cave geographic distance and between-cave differences in the average value of each retained climatic variable within the considered buffer represented the explanatory variables.We then assessed the explanatory performance of each model based on the corresponding explained deviance (on training data), and the model predictive performance by computing the Root Mean Squared Error (RMSE) of its predictions on the withheld cave pairs (n = 10).
For each fitted GDM, we also partitioned, through the 'gdm.partition.deviance'function, the portion of explained deviance attributable to three sets of variables (and pairwise combinations thereof): (i) "Dist", including only pairwise geographic distance; (ii) "Prec", comprising the precipitation-related variables Bio12, Bio15, Bio18 and Bio19; (iii) "Temp", including the temperature-related variables Bio3, Bio10 and Bio11 (Bio1 was discarded, after the VIF, for all the four buffer radii).
We then investigated possible statistically significant differences (in terms of explained deviance, RMSE and deviance partitioning) among the GDMs fitted on climatic averages from the four distinct buffer radii by performing Kruskal-Wallis Rank Sum Tests, followed by post-hoc one-tailed Pairwise Wilcoxon Rank Sum Tests with Benjamini & Hochberg 71 p value correction.We used non-parametric tests because preliminary data exploration 72 highlighted, for several dissimilarity metric × buffer radius combinations, the lack of fundamental assumptions of parametric tests (e.g., normality of residuals, homoscedasticity).For all these tests, the threshold for Type I error (i.e., "alpha-level") was set to p value < 0.05.Finally, we fitted a full-data GDM for each dissimilarity metric × buffer radius combination.From these full-data models, we inspected: (i) the distribution of predicted versus observed dissimilarity values; (ii) the relative importance of each explanatory variable, assessed through the permutation algorithm implemented in the 'gdm.VarImp' function; (iii) the I-spline curves corresponding to the top-three variables in terms of relative importance.
A flowchart summarising the workflow designed to conduct the set of analyses described above is shown in Fig. 2.

Results
Out of the 71 copepod species recorded in the 12 sampled caves, 46 were stygobites and 25 non-stygobites (Supplementary Materials Table S1).Among the non-stygobites, Paracyclops imminutus was by far the most widespread as it was recorded in 11 caves, followed by Bryocamptus echinatus which was found in 8 caves.Worth mentioning, 45 out of the 71 species were collected in a single cave each.Grotte di Pertosa-Auletta (Pe) showed the highest alpha diversity (22 species), followed by Grotte di Stiffe (St) with 18 species.Grotta Lindner (Li) and Tana che Urla (TcU) showed the lowest number of recorded species (5), followed by Grave Rotolo (Ro) and Grotta del Vento (GdV) hosting 6 species each.Some of the species being unique to single caves still lack scientific description, such as a new species of Hesperocyclops from Grave Rotolo and a new species of Pseudectinosoma from Grotta Puntore (Pu).
Despite the relatively high number of recorded species, all three non-parametric richness estimators suggested that more species may have been found if additional caves were sampled.Indeed, the ratio between observed and estimated richness was as low as 40.8% according to the Chao2 estimator which, however, also showed a high standard error (Supplementary Material Table S2); higher ratios between observed and estimated richness emerged for the first-order Jackknife (63.3%) and the Bootstrap (88.1%) estimators (Supplementary Material Table S2).The accumulation curves obtained using the three estimators were almost identical and none of them approached an asymptotic trend (Supplementary Material Fig. S1).
Total between-cave beta diversity was relatively high for most pairs of caves (Supplementary Material Table S3), with β sor values lower than 0.3 emerging only between the three caves located in the Apuan Alps [i.e., Antro del Corchia (Co), Grotta del Vento (GdV), and Tana che Urla (TcU)].Further, overall beta diversity (mean β sor ± SD = 0.64 ± 0.31; median β sor = 0.76) was mostly driven by turnover (mean β sim ± SD = 0.53 ± 0.29; median β sim = 0.60), with far lower nestedness-resultant dissimilarity (mean β sne ± SD = 0.11 ± 0.11; median β sne = 0.07).A preliminary visual comparison between dissimilarity values and geographic distances (Supplementary Material Tables S3, S4) revealed that relatively close caves may be noticeably dissimilar in terms of copepod assemblages, while caves located hundreds of kilometres apart may have quite a low species turnover.For instance, Abisso di Trebiciano (Tr) and Grotta Lindner (Li), which are just 15 km distant, showed a high beta diversity (β sor = 0.86) and, most notably, this was almost totally driven by turnover (β sim = 0.8).Differently, Grotte di Stiffe (St) and TcU, which are located 325 km apart, showed one of the lowest turnover values (β sim = 0.2), with beta diversity in this case being primarily driven by nestedness of the TcU assemblage within that of St (β sne = 0.48).
Both the negative exponential and the power-law distance-decay models showed statistically significant explanatory performance for both turnover and total beta diversity, but not for nestedness (Supplementary Material Table S5).The distance-decay curves resulting from the power-law model showed a better fit to observed values compared to the ones resulting from the negative exponential model, though both of them either underestimated or overestimated dissimilarity for some pairs of caves (Fig. 3).
Whatever the extent of the buffer over which the climatic averages were computed, the explanatory performance of the GDMs fitted with nestedness as response variable was far lower (median explained deviance < 10%) than that of the models fitted for turnover and total beta diversity (Fig. 4), thus mirroring the results of the distance-decay models.
Focusing on turnover and total beta diversity, buffer size also influenced the percentage of deviance explained by the different sets of variables (p < 0.001 in the Kruskal-Wallis Rank Sum Tests), except within the GDMs targeting total beta diversity and fitted upon "Dist" only (χ 2 = 5.141, DF = 3, p value = 0.16) and "Dist + Prec" (χ 2 = 0.982, DF = 3, p value = 0.806).Post-hoc tests confirmed significantly higher explained deviance using a 10 km-radius compared to the other three radii for the GDMs including temperature variables, namely those fitted only on "Temp", "Dist + Temp" and "Prec + Temp" (Supplementary Materials Table S7).Further, the GDMs targeting turnover and fitted only on "Temp" variables had higher explanatory performance than those fitted using only geographic distance or "Prec" variables, and their median explained deviance increased about twicefold when using a 10 km-radius buffer (Fig. 5).For total beta diversity, GDMs fitted only on "Dist" explained as much deviance as those fitted on "Temp", except when temperature averages were computed over a 10 km-radius buffer (Fig. 5).However, none of the models fitted on a single set of variables showed explained deviance higher than 15%.The GDMs fitted on coupled sets of variables generally explained less deviance than models including all the variables, but such difference was slight for "Dist + Temp" models fitted using the 10 km-radius buffer (Fig. 5).www.nature.com/scientificreports/GDMs fitted on all the 66 cave pairs (i.e., "full-data models") confirmed that computing climatic averages over a wider area around cave locations increased model fit on observations, for both turnover and total beta diversity.Indeed, Spearman's correlation between predicted and observed dissimilarity values increased from r = 0.4 with 0.5 km-radius buffer to r = 0.5 with 10 km-radius buffer for turnover, and from r = 0.32 with 0.5 kmradius buffer to r = 0.45 with 10 km-radius buffer for total beta diversity (Supplementary Materials Figs.S2, S4).Contrarily, the low correlation between predicted and observed values of nestedness further decreased when using the 10 km-radius buffer (Supplementary Materials Fig. S3)."Temp" variables (i.e., Bio10, Bio11) consistently emerged as the top-contributing ones, followed by geographic distance, within the full-data GDMs fitted with turnover as response variable (Table 2).Differently, geographic distance was the most influential predictor for total beta diversity, except within the full-data GDM fitted using the 10 km-radius buffer.In line with the results of the deviance partitioning analysis conducted on the 1000 partial GDMs (i.e., those fitted on 85% random samples of the cave pairs and validated on the remaining 15%), the relative importance of seasonal temperature averages noticeably increased within the full-data GDMs fitted using the 10 km-radius buffer, for both turnover and total beta diversity (Table 2).In the latter models, Bio11 was excluded from model fitting due to multicollinearity emerging in the VIF, then being replaced by Bio10 as the leading variable.The I-spline curves extracted from the full-data GDMs confirmed the strong contribution of temperature gradients to the predicted dissimilarity in terms of turnover (Fig. 6): indeed, an exponential growth in the modelled partial ecological distance emerged with increasing temperature for both Bio11 and Bio10, while the curves obtained for the other predictors showed logarithmic (for geographic distance) or sigmoidal (for "Prec" variables, i.e.Bio18 and Bio19) trends, with far lower maximum values.For total beta diversity, the I-spline drawn for geographic distance still showed a logarithmic trend but with a far steeper increase in partial ecological distance up to 300 km and a highest peak (Supplementary Materials Fig. S5).Nonetheless, changes along the gradient of Bio10 translated into increases in partial ecological distance far more rapidly than for geographic distance within the full-data model fitted with the 10 km-radius buffer (Supplementary Materials Fig. S5).

Discussion
The twelve caves showed a notable variety of copepod species.Species fully dependent on groundwater (i.e., stygobites) and species occasionally or accidentally entering cave waters from the surface (i.e., non-stygobites) were both well represented, though with the primacy of the obligate groundwater dwellers.Overall, we identified 71 copepod species in the collected samples and some of the sampled caves hosted more than 15 species.Nonetheless, the implemented richness estimators indicated that the set of recorded species was likely not exhaustive in terms of copepod alpha diversity across the investigated karst macro-areas.This is in line with the modern view of subterranean environments being far richer in life than previously thought 9,18,73 , and it further reinforces the recent claims of the scientific community about the urgency of filling the existing knowledge gaps about subterranean biodiversity 6,74 .Moreover, the high incidence of spot endemics (23 stygobitic species) and of rare-in terms of frequency of occurrence-species (5 stygobitic species) is likely the main factor that determined a long upward slope to the asymptote in the obtained species accumulation curves 75 .
Along with a high alpha diversity, we were able to highlight a striking between-cave taxonomic dissimilarity (i.e., beta diversity) which was driven mainly by pure species replacement, in line with what has been derived for the European groundwater fauna by Zagmajster et al. 76 , from the regional to the continental scale.The dominance of turnover over nestedness-resultant dissimilarity (i.e., dissimilarity due to ordered species loss) in shaping beta diversity is emerging in numerous other ecosystems across a wide variety of spatial scales.In a recent meta-analysis including 99 studies conducted in various regions across the globe, Soininen et al. 77 showed that available evidence points to a major role of turnover particularly at lower latitudes and with increasing extent of the studied area, while nestedness-resultant dissimilarity generally increases towards the poles.This trend is likely due to stronger environmental filtering at higher latitudes, driven by factors such as paleoclimatic events (e.g., Quaternary glaciations) and harsher current conditions (e.g., longer freezing periods and lower primary productivity).
Across the European continent, similar patterns were retrieved for a wide array of taxa, including beetles 37,78 , mammals 79 , parasitic trematodes 80 , aquatic plants and cladocerans 81 , freshwater fishes 82 , and groundwater crustaceans 33 .In such studies, the relative contribution of spatial (i.e., geographic distance) versus environmental (e.g., climate, land cover, soil composition) factors on the ratio between turnover and nestedness also varied depending on the target taxa.For instance, Heino et al. 78 found that overall beta diversity among northern European regions was similar between ground beetles and diving beetles; however, compositional dissimilarity linked to richness differences (corresponding to nestedness-resultant dissimilarity) was higher than turnover for the former group, and vice versa for the latter.For ground beetles, the authors linked the preponderance of nestedness, and the major contribution of geographic distance to the observed dissimilarity, to the Quaternary glaciations which have determined regional extinctions in northern areas and to biogeographical barriers hampering subsequent re-colonizations.Differently, the higher turnover emerging for diving beetles was more strongly related to temperature gradients than to spatial isolation, suggesting that environmental filtering rather than dispersal barriers shaped current beta diversity patterns for this group.In southern Europe, however, the dominance of pure species replacement has emerged for various taxa with widely diverging life-history traits, dispersal capability and evolutionary history 33,79,81 .This pattern has been generally attributed to the higher historical climatic stability of southern Europe compared to boreal regions, which would have favoured speciation events over long periods devoid of dramatic climatic changes causing mass extinctions.
Focusing on groundwater crustaceans, copepods included, Zagmajster et al. 33,76 confirmed higher turnover values at lower latitudes, a major contribution of turnover over nestedness to total beta diversity, and significant linear correlations (positive for turnover, negative for nestedness) with geographic distance.At a far finer spatial scale, a study conducted on copepod assemblages populating a set of thirty groundwater-fed springs from Central Italy 36 highlighted a high nestedness-resultant dissimilarity, which was also positively correlated to betweenspring geographic distance; such a correlation with geographic distance did not emerge, instead, for the turnover component.However, in both these studies, the authors did not implement any analytical tool permitting to explicitly discriminate the proportion of dissimilarity being purely attributable to geographic distance from that driven by environmental gradients.Table 2. Top-three variables, in terms of relative contribution, within the full-data GDMs fitted for turnover and total beta diversity using the four distinct buffer radii for spatial averaging of climate-related variables.The distance-decay models we implemented, using either a negative exponential function or a power-law function, showed that total beta diversity and turnover were significantly related to between-cave geographic distance, while nestedness was not.However, distance-decay curves did not perfectly fit the observed dissimilarity values, due to the fact that the relationship between geographic distance and faunal dissimilarity was not always www.nature.com/scientificreports/positive.Indeed, some caves far apart from each other shared more species of copepods than neighbouring caves.For instance, the Tana che Urla and the Grotte di Stiffe caves, which are 325 km apart, share 3 non-stygobitic species (Bryocamptus echinatus, Bryocamptus zschokkei and Paracyclops imminutus) and the wide-ranging stygobitic harpacticoid Elaphoidella phreatica, leading to a low turnover (β sim = 0.2).Contrarily, the Abisso di Trebiciano and the Grotta Lindner caves, belonging to the same hydrogeological unit (Classical Karst 83 ) and being closer than 20 km, shared the non-stygobitic species Bryocamptus echinatus only.In this case, a high between-cave turnover (β sim = 0.8) emerged because Abisso di Trebiciano hosted 3 stygobitic (Troglodiaptomus sketi, Diacyclops charon and Elaphoidella jeanneli) and 4 non-stygobitic (Acanthocyclops robustus, Macrocyclops albidus, Megacyclops viridis and Bryocamptus zschokkei) species which were not found in Grotta Lindner, while the latter hosted 3 stygobites (Acanthocyclops hypogeus, Elaphoidella elaphoides and Elaphoidella phreatica) and the non-stygobite Paracyclops imminutus which were absent from Abisso di Trebiciano.These examples suggest that the actual degree of between-cave species replacement is determined by both the stygobitic and the non-stygobitic species.Specifically, low species turnover between distant caves could be primarily related to non-stygobitic, often widely distributed, species that have established populations (most likely temporarily) within the caves under consideration.On the other hand, high turnover between close caves was largely driven by the stygobitic copepods, which are often unable to cross the rocky boundaries between even adjacent caves due to their low potential for dispersal, and thus usually respond to the rule "each species per each cave" 13,18,74 .Along with dispersal barriers related to the lithology of the areas where the caves open 14 , changes in groundwater faunistic assemblages over regional spatial scales could also derive from differences in the size, spatial arrangement and heterogeneity of the different habitat types and/or in the physical-chemical conditions (e.g., pH, electric conductivity) characterising distinct caves belonging to adjacent-or even to the same-hydrogeological units 84 .By fitting Generalized Dissimilarity Models (GDM) fed with information about surface climatic conditions around the sampled caves, we found that a major portion of the observed between-cave compositional dissimilarity is explained by a mix of spatial (i.e., geographic distance) and climatic factors.Indeed, quarterly temperature averages (i.e., Bio10 and Bio11) consistently emerged as the most influential explanatory variables in the GDMs fitted for between-cave turnover.With respect to total beta diversity, Bio10 was still by far the most influential variable when sampling climate data over a 10 km-buffer radius, but geographic distance showed a not negligible influence in the corresponding model.Therefore, environmental filtering, primarily related to thermal ranges, appeared to mainly determine the set of species being unique to the different caves, and thus betweencave turnover.This may derive from the combination of two processes: (i) differences in temperature averages between regions may translate into differences in the timing and amount of groundwater recharge following ice and snow melting, along with rainfall, which in turn affects the degree to which non-stygobitic copepods can move in the groundwater through the hydrologic continuum between surface water and groundwater 85 ; (ii) since temperature within caves is generally correlated to surface mean annual temperature 86 , temperature patterns in the neighbourhoods of the different caves may directly translate into thermal differences in cave waters, thus shaping the corresponding assemblages based on the thermal niche breadth of the species forming the respective regional pools 87,88 .
Further, we found that the extent over which climate averages were computed significantly influenced, for both turnover and total beta diversity, the goodness-of-fit of the GDMs, their predictive performance and the relative importance of temperature-related explanatory variables.Indeed, when increasing the radius of the buffer around the sampled caves up to 10 km, the proportion of explained deviance and the correlation between observed and predicted dissimilarity values increased, the RMSE of model predictions on the test data decreased, and Bio10 became by far the most influential variable.Additionally, the variance partitioning experiment clearly showed that temperature-related variables provided the greatest amount of information, when used in isolation or coupled with geographic distance, to model between-cave partial ecological distance.Again, the contribution of temperature-related variables within the GDMs fitted on subsets of the available explanatory variables significantly increased when using a 10 km-radius buffer.These patterns may be linked to the fact that the aquatic habitats of caves are replenished by surface rainfall and temperature-driven ice and snow melting over recharge areas which usually span dozens-to-hundreds of kilometres and that are often located far from the caves' openings 14,89 .Thus, extending the area from which data are acquired should provide more information about how differences in regional climate affect the aquifers' recharge in the different caves, and this in turn should increase our ability to isolate climatic correlates of between-cave biotic dissimilarity.Our findings also recall those from Keil et al. 90 who showed that, over regional extents (i.e., single countries or more restricted areas), climatic factors were more important than land cover and geographic distance in explaining pairwise species turnover for various taxonomic groups in Europe, and that the relative contribution of climate increased at coarser grid resolutions.The residual unexplained deviance in our models, namely the portion of between-cave beta diversity which could not be directly linked either to climatic averages or to geographic distance, suggests that the inclusion of some of the local-scale factors cited above (e.g., habitat size and heterogeneity, physical-chemical parameters) would be recommendable to fully understand the drivers behind the composition of crustacean assemblages in cave waters.However, while measuring differences in average physical-chemical parameters between distinct caves is rather straightforward, quantifying between-cave differences in habitat heterogeneity is a far more complex operation, also affected by subjectivity in classifying the single habitat patches within the compared caves.
In conclusion, this study highlighted that Italian karst caves harbour a rich and diverse array of copepod species, including several stygobitic and narrow-ranging species.Further, we demonstrated that copepod assemblages populating a set of caves along a latitudinal gradient spanning the Italian peninsula were noticeably dissimilar from one another, with turnover rather than nestedness driving most of the observed beta diversity.This is also important from a conservation perspective because the high species turnover implies that environmental degradation within a single cave may lead to the loss of unique species not occurring elsewhere, similarly to what has emerged from a recent study performed on groundwater-fed karst springs 41 .Additionally, we showed www.nature.com/scientificreports/ that geographical distance had a not negligible yet secondary role in shaping the between-cave dissimilarity of copepod assemblages in our studied caves, with surface temperature patterns having the greatest influence, particularly on species replacement.Finally, we showed that widening the spatial extent from which climatic data were retrieved in the surroundings of the target caves increased the ability of Generalized Dissimilarity Models to explain the observed beta diversity patterns, along with the relative importance of temperature-related variables.A limitation of the present study about this latter finding is that, by using circular buffers to compute climatic averages, we did not take into account the inherent anisotropy of the environmental gradients affecting the groundwater recharge dynamics (e.g., altitude and geomorphology influencing the direction from which water flows towards a given cave).Further, the possible presence of cryptic (i.e., with very similar phenotypes yet genetically distinct) groundwater-dwelling species may affect the estimates of between-cave beta diversity and its components, a risk which could be lowered by coupling molecular species delimitation methods to the morphology-and ecology-based taxonomic assessment 91 .Future studies replicating our analyses by taking into account these points, also focusing on caves from other regions, broader spatial extents and/or different subterranean taxa, would increase our understanding of the drivers behind the structuring and diversification of groundwater communities.

Figure 2 .
Figure 2. Flowchart summarising the different analytical steps taken to investigate the contribution of geographical distance and surface climatic conditions to between-cave taxonomic beta diversity.

Figure 3 .
Figure 3. Distance-decay curves resulting from the distance-decay models based on either a negative exponential function (left plots) or a power-law function (right plots), fitted for between-cave turnover, nestedness and total beta diversity.Black dots showed the observed beta diversity-geographic distance pairs of values.

Figure 4 .Figure 5 .
Figure 4. Half-violin plots showing variation in (a) explained deviance, and (b) RMSE on test data, extracted from the Generalized Dissimilarity Models fitted using climatic averages computed over four distinct buffer radii around cave locations (0.5 km, 2.5 km, 5 km and 10 km) as explanatory variables (along with between-cave geographic distance).The horizontal bar within each half-violin indicates the median.

Figure 6 .
Figure 6.I-spline curves extracted from the full-data GDMs fitted with turnover as response variable, and including as explanatory variables climatic averages computed over a buffer around the caves' entrance whose radius measured: (a) 0.5 km, (b) 2.5 km, (c) 5 km, and (d) 10 km.The curves show the relationship between modelled partial ecological distance and the values of the top-three variables in terms of relative contribution to the considered GDM.In each row, the order of the plots reflects the relative contribution of the variables, with the top-contributing variable on the left. https://doi.org/10.1038/s41598-023-48440-7 The Euclidean geographic distances